Computer and Modernization, 2012, Vol. 203, Issue (7): 120-123. doi: 10.3969/j.issn.1006-2475.2012.07.032

• Network and Communication •

Research on E-mail Classification Based on Dynamic Characteristics Library

MU Jun-peng1, DONG Kui-feng2, ZHANG Ming2

  

  1. Shanghai Publishing and Printing College, Shanghai 200093, China; 2. College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China
  • Received: 2012-02-24 Revised: 1900-01-01 Online: 2012-08-10 Published: 2012-08-10

Abstract: As e-mail classification technology develops, features must be extracted from a constantly changing stream of e-mail so that message categories can be organized and managed more effectively as those features change. This article addresses the problem from the perspective of a message's dynamic characteristics: it uses mail client software, applies the ICTCLAS tool for Chinese word segmentation, computes mail feature weights with an improved TF-IDF algorithm, and verifies the results in a simulation experiment with the WEKA mining tool. The experimental results show that using the dynamic characteristics of mail messages to follow changing features in mail classification is feasible and that, to a certain extent, the method is more reasonable and effective.

Key words: dynamic characteristics, mail classification, Chinese word segmentation, TF-IDF, WEKA, data mining
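
For orientation, the feature-weighting step named in the abstract can be sketched with the textbook TF-IDF formula. This is a minimal sketch only: the abstract does not specify the authors' improved variant, so the code below is the standard form, not the paper's method. The function name `tf_idf` and the sample tokens are hypothetical, and the input is assumed to be already segmented into words (the role ICTCLAS plays for Chinese text).

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Textbook form: tf = count / document length, idf = log(N / df).
    This is NOT the paper's improved variant, which the abstract
    does not describe.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical, already-segmented messages (tokens are illustrative only).
mails = [
    ["meeting", "schedule", "project"],
    ["discount", "offer", "discount"],
    ["project", "report", "schedule"],
]
weights = tf_idf(mails)
```

Terms that occur in fewer messages receive higher weights. Recomputing the weights whenever the mail corpus changes is one simple way to let the feature set follow new mail, though whether the paper's dynamic characteristics library works this way is an assumption.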
